chore: disable flakey tests in the default runs #1120
Remaining issue: #1142 (integration tests don't terminate). I wonder if this PR is still an improvement and should be merged...
It seems this issue is also intermittent, as the second run of the integration tests on this PR did terminate. However, it terminated with failures in two further tests! These tests didn't fail once in the 30 runs I did previously. If it were just one additional test, I'd assume we just got unlucky and hit a 1/40 chance failure; after all, it's pretty much even odds whether such an event would show up in a survey of just 30 runs. Getting two independent failures with such low probability in the same run is pretty unlikely though, so what's going on?
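As a rough check on that arithmetic, here is a minimal sketch. It assumes each test fails independently with the illustrative 1/40 per-run probability used above; the real per-test failure rates are unknown, and the function name is made up for this example.

```python
def prob_at_least_one(p, runs):
    """Probability that an event with per-run probability p occurs at least once in `runs` independent runs."""
    return 1 - (1 - p) ** runs


# Illustrative per-run failure probability taken from the comment above (an assumption).
p_fail = 1 / 40

# Chance that a 1/40 failure shows up at least once in a 30-run survey: roughly 0.53.
p_seen_in_30 = prob_at_least_one(p_fail, 30)

# Chance that two specific, independent 1/40 failures both occur in a single run: 1/1600.
p_both_in_one_run = p_fail ** 2

print(f"seen in 30 runs: {p_seen_in_30:.3f}")
print(f"both in one run: {p_both_in_one_run:.6f}")
```

With many tests in the suite, the chance that some pair of rare failures coincides is higher than 1/1600, but it stays small unless the failures are correlated, which is what the question above is hinting at.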
For now I'm going to run the integration tests again here and also sample another 20 or so runs on main.
Seems reasonable to me as an interim solution, while we're working on improving the tests.
I don't think that we should go forward with this as is, since it doesn't actually achieve the goal of making CI pass, due to:
But if we can get CI to pass consistently by disabling apparently flakey tests, then I am in favour.
20 more runs on main identify 6 additional flakey tests (table of just the newly failing tests below, and full table here). I think we should split the integration tests up into different tox environments (by file, I guess), so that they can be run serially within the time limit for GitHub Actions. That way they should at least be (edit: more!) deterministic. commit=50b42d013aee01536416e6334f99443f2b4f1e4c
Update: running the tests serially seems to eliminate a lot of flakiness! See PR and test output here.
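For anyone wanting to try the serial approach locally, here is a minimal sketch of running each test file one after another. It is not the tox configuration from the linked PR; the function names and the `test_*.py` file pattern are assumptions made for illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(test_dir):
    """Return one pytest command per test file, in sorted order."""
    files = sorted(Path(test_dir).glob("test_*.py"))  # assumed naming pattern
    return [[sys.executable, "-m", "pytest", str(f)] for f in files]


def run_serially(test_dir):
    """Run each file's tests one at a time; return {file path: exit code}."""
    results = {}
    for cmd in build_commands(test_dir):
        # subprocess.run blocks until the command finishes, so files never overlap.
        results[cmd[-1]] = subprocess.run(cmd).returncode
    return results
```

Calling `run_serially` on the integration test directory returns the pytest exit code for each file, which makes it easy to see which file a failure came from.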
Going to close in favour of #1143 in a while.
Closed in favour of #1149 |
Ref: #1108 (comment)